
SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations

Xingguang Zhong, Yue Pan, Jens Behley, Cyrill Stachniss

2023 IEEE International Conference on Robotics and Automation (ICRA)

10.1109/ICRA48891.2023.10160907




Introduction

  • First work aiming to construct large-scale implicit neural maps from point cloud input
  • Low memory cost, high speed, high reconstruction quality

Method

  • Construct the world coordinate frame based on the first frame
  • Each level of the sparse hierarchical grid has a hash table storing the feature vectors of its nodes
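A minimal sketch of the per-level hash-table idea, using the classic three-prime spatial hash. All names and parameter values here are illustrative assumptions; the paper stores features at octree node corners and interpolates, which this sketch simplifies to a containing-voxel lookup:

```python
import numpy as np

# Illustrative sketch: one hash table per resolution level maps integer voxel
# coordinates to feature vectors. Primes are the standard spatial-hash choice.
PRIMES = (73856093, 19349669, 83492791)

def voxel_hash(ix, iy, iz, table_size):
    # XOR of coordinate-prime products, folded into the table range
    return (ix * PRIMES[0] ^ iy * PRIMES[1] ^ iz * PRIMES[2]) % table_size

class HierarchicalFeatureGrid:
    def __init__(self, num_levels=3, base_voxel=0.2, feat_dim=8, table_size=2**16):
        # coarser levels double the voxel size
        self.voxel_sizes = [base_voxel * (2 ** l) for l in range(num_levels)]
        self.tables = [{} for _ in range(num_levels)]  # hash key -> feature
        self.feat_dim = feat_dim
        self.table_size = table_size

    def query(self, point):
        # Sum the features of the containing voxel at every level; unseen
        # voxels are lazily initialized, mimicking sparse allocation.
        feats = []
        for level, vs in enumerate(self.voxel_sizes):
            ix, iy, iz = (int(np.floor(c / vs)) for c in point)
            key = voxel_hash(ix, iy, iz, self.table_size)
            feats.append(self.tables[level].setdefault(
                key, np.zeros(self.feat_dim)))
        return np.sum(feats, axis=0)
```

Only voxels that are actually queried ever allocate storage, which is what keeps the map memory sparse.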

Difficulties

  • Point Sampling
  • Incremental Mapping without forgetting


Training pairs and Loss Function

  • Properties of a LiDAR scene: sparse and noisy, so obtaining the true SDF is very difficult.
    • Directly obtain training pairs by sampling points along the ray, and use the signed distance from a sampled point to the beam endpoint as the signed distance between that point and the underlying surface.
  • How to solve the accuracy problem?
  • For SDF-based mapping, the regions of interest are the values close to zero, as they define the surfaces.
    • Sampled points closer to the endpoint should have a higher impact, since the precise SDF value far from a surface has very little influence.
  • Instead of an L2 loss, use a BCE (binary cross-entropy) loss; the SDF value is first mapped through a sigmoid function:
    • $L_{bce} = -\left[\, l_i \log(o_i) + (1 - l_i) \log(1 - o_i) \,\right]$
    • $o_i = \mathrm{sigmoid}(f_\theta(\mathbf{x}_i)), \quad l_i = \mathrm{sigmoid}(d_i)$
  • Sampling policy:
    • Sample $N_f$ points in the free space and $N_s$ points inside the truncation band $\pm 3\sigma$ around the surface.
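The sampling policy above can be sketched as follows. Function name, sample counts, and the sigmoid scaling by $\sigma$ are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

# Sketch of generating training pairs along one LiDAR ray: free-space samples
# between the sensor and the truncation band, plus samples within +-3 sigma
# of the measured endpoint. The projected signed distance to the endpoint is
# the SDF label, squashed by a sigmoid for the BCE loss.
def sample_along_ray(origin, endpoint, n_free=4, n_surface=6, sigma=0.05,
                     rng=np.random.default_rng(0)):
    direction = endpoint - origin
    depth = np.linalg.norm(direction)
    direction = direction / depth
    # free space: uniform between sensor and the start of the truncation band
    t_free = rng.uniform(0.0, depth - 3 * sigma, n_free)
    # truncation band: uniform within +-3 sigma around the endpoint
    t_surf = rng.uniform(depth - 3 * sigma, depth + 3 * sigma, n_surface)
    t = np.concatenate([t_free, t_surf])
    points = origin + t[:, None] * direction
    d = depth - t                              # projected signed distance
    labels = 1.0 / (1.0 + np.exp(-d / sigma))  # sigmoid label in (0, 1)
    return points, labels
```

Free-space samples get labels near 1 (well in front of the surface), while samples inside the band spread across (0, 1), concentrating the BCE gradient near the zero crossing.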

Why Sigmoid?

The sigmoid saturates away from the zero crossing, so the unreliable projected-distance labels far from the surface are compressed toward 0 or 1, while samples near the surface keep a strong gradient.

Loss Function

Eikonal Loss

$L_{eik} = \left( \left\| \frac{\partial f_\theta(\mathbf{x}_i)}{\partial \mathbf{x}_i} \right\| - 1 \right)^2$
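A quick numerical sanity check of the Eikonal property: a true SDF has unit gradient norm everywhere, so the loss is ~0, while a scaled field violates it. The finite-difference gradient here is an illustrative stand-in for autograd:

```python
import numpy as np

# Eikonal loss with a central finite-difference gradient (stand-in for
# automatic differentiation of the network f_theta).
def eikonal_loss(f, x, eps=1e-4):
    grad = np.array([
        (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
        for e in np.eye(3)
    ])
    return (np.linalg.norm(grad) - 1.0) ** 2

# Exact SDF of the unit sphere: gradient norm is 1 everywhere off-center.
sphere_sdf = lambda p: np.linalg.norm(p) - 1.0
# Scaling the field breaks the Eikonal property (gradient norm 2, loss 1).
scaled_field = lambda p: 2.0 * sphere_sdf(p)
```

Penalizing this loss at sampled points pushes the learned field toward a metrically correct signed distance rather than an arbitrary occupancy-like function.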

Incremental Mapping Without Forgetting

![Incremental mapping across frames T0, T1, T2](iidfinal.png)

  • We can only obtain partial observations of the environment at each frame (T0, T1, and T2). During incremental mapping, we use the data captured in area A0 to optimize the features V0, V1, V2, V3; after training converges, V0–V3 encode an accurate geometric representation of A0. However, if we move forward and use the data from frame T1 to train and update V0–V3, the network only focuses on reducing the loss in A1 and no longer cares about the performance in A0, which may degrade the reconstruction accuracy in A0. The same problem occurs from T1 to T2.
Regularization term

$L_r = \sum_{i} \Omega_i \left( \theta_i^{t} - \theta_i \right)^2$

Update the importance weight as the sensitivity of the loss on previous data to a parameter change, as suggested in prior incremental-learning research:

$\Omega_i = \min\!\left( \Omega_i + \sum_{k=1}^{N} \left| \frac{\partial L_{bce}(x_k, l_k)}{\partial \theta_i} \right|,\; \Omega_m \right)$
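The two equations above can be sketched together. Function names and the toy values are assumptions for illustration; in practice the gradients come from backpropagating the BCE loss over the last area's samples:

```python
import numpy as np

# Importance-weight update: accumulate per-parameter gradient magnitudes of
# the BCE loss over the N samples of the previous area, clamped at Omega_m.
def update_importance(omega, grads, omega_max):
    # grads has shape (N, num_params): |dL_bce(x_k, l_k)/dtheta_i| per sample
    return np.minimum(omega + np.abs(grads).sum(axis=0), omega_max)

# Regularizer: penalize drift of important parameters from their converged
# values theta_t, i.e. L_r = sum_i Omega_i * (theta_t_i - theta_i)^2.
def regularizer(theta, theta_t, omega):
    return np.sum(omega * (theta_t - theta) ** 2)
```

Parameters that mattered for previously mapped areas accumulate large $\Omega_i$ and are held near their old values, while unimportant ones stay free to adapt to the new area.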

Experiments

![qualitative_mai_v3.png](qualitative_mai_v3.png)
A comparison of different methods on the MaiCity dataset. The first row shows the reconstructed mesh, with a tree highlighted in the black box. The second row shows the error map of the reconstruction overlaid on the ground-truth mesh, where the blue-to-red colormap indicates the signed reconstruction error from -5 cm to +5 cm. (From left to right: SHINE, Voxblox (TSDF-based), VDB Fusion (TSDF-based), Puma (Poisson-based surface reconstruction), SHINE+DR (with a differentiable-rendering method).)

Experiments on the MaiCity dataset and the Newer College dataset


Comparison of map memory efficiency

A comparison of the incremental mapping results with and without regularization